REMARKS

The claims are 1-20. Claims 1, 3, 4, 7, 14, 16 and 18 have been amended. 
Claims 1, 3, 7, 14, 16 and 18 are in independent form. Favorable reconsideration and 
allowance of the subject application are respectfully requested in view of the following 
comments. 

Claims 1, 3, 7, 14, 16 and 18 have been amended to clarify that the energy bar 
claimed has about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g of fortification 
components, about 5 to about 40 g of protein, about 2 to about 10 g of fat, about 150 to about 
300 calories, and a moisture content of less than about 15% by weight, based on a 55 g 
serving size. Support for the amendment can be found, for example, in paragraph [0016] on 
pages 4-5, and paragraph [0042] on page 12 of the specification. 

Claim 4 has been amended to correct a minor error. 

Claims 1-3, 7, 15, 17, 19 and 20 stand rejected under 35 U.S.C. § 112, second
paragraph, for allegedly being indefinite. Specifically, the Office Action has objected to the 
use of the terms "hedonic score," "confidence level," and "acceptability." Applicants 
respectfully direct the Examiner's attention to paragraphs [0020] and [0022] on pages 5 and 6 
of the specification, where the definitions for "hedonic score" and "confidence level" are 
provided. Moreover, Applicants note that the term "acceptability" is understood in the food 
industry to denote a consumer's willingness to eat a product. See Principles of Sensory 
Evaluation of Food, 1965, p. 278. Applicants also wish to point out that one skilled in the art 
understands that the hedonic score and confidence intervals are statistically determined 
measurements and are reproducible within a certain degree of error. Applicants respectfully 
direct the Examiner's attention to the following publications, which demonstrate the use of
these terms throughout the food industry: Sensory Analysis of Foods, pp. 250, 254-257, and
366¹; Statistical Methods in Food and Consumer Research, pp. 7 and 8; and Principles of
Sensory Evaluation of Food pp. 275-289. Copies of each are enclosed for the Examiner's 
convenience. Accordingly, Applicants respectfully request withdrawal of the Section 112 
rejections. 

Claims 1-20 have been provisionally rejected under the judicially created 
doctrine of obviousness-type double patenting as being unpatentable over claim 20 of 
copending Application No. 10/272,571 (the '571 application) and claims 1-20 of copending 
Application No. 10/271,710 (the '710 application). Applicants note that the '571 application 
was abandoned on September 15, 2004 for being non-responsive to the Office Action issued 
on June 15, 2004. As such, the provisional rejection based on the '571 application is rendered 
moot. Regarding the provisional rejection based upon the '710 application, a Terminal 
Disclaimer is submitted herewith. In light of the above comments, it is believed that the 
provisional double patenting rejections have been obviated, and their withdrawal is therefore 
respectfully requested. 

Claims 1-10 stand rejected under 35 U.S.C. § 102(b) as allegedly being 
anticipated by U.S. Patent No. 4,055,669 ("Kelly"). Claims 1-13 and 18-20 stand rejected 
under 35 U.S.C. § 103(a) as allegedly being obvious over Kelly in view of U.S. Patent No. 
6,592,915 ("Froseth") and a recipe for Pfeffernusse found in the book titled, Joy of Cooking 
("Rombauer"), on page 708. Claims 14-17 stand rejected under 35 U.S.C. § 102(b) as 
allegedly being anticipated by Rombauer. Applicants respectfully traverse these rejections, in 
view of the comments set forth below. 

¹ The hedonic score may be based on a nine-point scale or seven-point scale. For purposes of the present
invention, a seven-point scale was selected.

As amended, claim 1 is directed to an energy food bar that provides about 2 to
about 55 g of carbohydrates, about 1 to about 4.5 g of fortification components, about 5 to
about 40 g of protein, about 2 to about 10 g of fat, about 150 to about 300 calories, and a
moisture content of less than about 15% by weight, based on a 55 g serving size.

Kelly is directed to a high protein fat occluded food composition made of
cereal particles and a binder. The binder includes a protein source coated with an edible fat,
which masks the protein flavor, making the binder taste bland.

Applicants have reviewed Kelly and have determined that the amount of fat in
the food composition exceeds the permissible amount set forth in claim 1 of about 2 to about
10 g of fat, based on a 55 g serving size. In column 2, lines 56-58, Kelly discloses that a
binder composition makes up 60-70% of the food composition. Kelly further states in column
3, lines 61-64, that "[t]he fat content of the binder composition ranges from a minimum of
about 33% by weight to a maximum of about 85% by weight, preferably about 47% by
weight[.]" Therefore, the minimum amount of fat present in the binder composition of Kelly
can be calculated by multiplying the (percent binder) by the (percent fat in the binder) by the
(serving size). For a 55 g serving, the minimum amount of fat present in the binder
composition alone is 10.9 g of fat (55 g × (33% fat) × (60% binder)). Moreover, additional
fat in the food composition of Kelly is found in the cereal components that make up the other
40% of the food composition. Low fat cereal components such as crisp rice or corn flakes
have about 0.5% fat. For a 55 g serving basis, this would amount to 0.1 g of fat (55 g × (0.5%
fat) × (40% cereal)) in the cereal portion. The minimum total amount of fat in the food
composition is therefore calculated to be 11 g of fat. This clearly exceeds the range of about 2
to about 10 grams of fat permitted in the energy food product set forth in claim 1. As such, it
is respectfully submitted that claim 1 is patentable over Kelly.
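
The minimum-fat arithmetic above can be reproduced with a short sketch. The function and variable names are illustrative assumptions and not part of the record; the percentages are the figures quoted above from Kelly, together with the 0.5% cereal fat figure stated above.

```python
# Minimal sketch of the minimum-fat calculation for Kelly, per 55 g serving.
# All names are hypothetical; the numbers come from the remarks above.

def component_grams(serving_g, *fractions):
    """Multiply a serving size by a chain of weight fractions."""
    grams = serving_g
    for fraction in fractions:
        grams *= fraction
    return grams

SERVING_G = 55.0

# Binder: at least 60% of the composition, at least 33% fat by weight.
binder_fat_g = component_grams(SERVING_G, 0.60, 0.33)

# Cereal: the remaining 40%, assumed low fat at about 0.5% by weight.
cereal_fat_g = component_grams(SERVING_G, 0.40, 0.005)

min_total_fat_g = binder_fat_g + cereal_fat_g

print(f"binder fat: {binder_fat_g:.2f} g")
print(f"cereal fat: {cereal_fat_g:.2f} g")
print(f"minimum total fat: {min_total_fat_g:.2f} g")
```

Because the binder share and the binder fat content are both taken at their lowest disclosed values, any other combination within Kelly's disclosed ranges would give a larger figure.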

Claim 2 directly depends from claim 1. For at least the same reasons discussed 
above in connection with claim 1, claim 2 is patentable over Kelly. 

Independent claims 3 and 7, as well as their respective dependent claims, require
that the energy bar have about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g of
fortification components, about 5 to about 40 g of protein, about 2 to about 10 g of fat,
about 150 to about 300 calories, and a moisture content of less than about 15% by weight,
based on a 55 g serving size. As such, claims 3 and 7, and their respective dependent claims,
are patentable over Kelly.

Froseth discloses a layered cereal bar having identifiable ready to eat cereal 
pieces and at least one visible filling layer. The cereal bar has a total nutrient level equal to or 
greater than the nutrient level of a single serving of boxed cereal with milk. 

Froseth, however, does not disclose a cereal bar having about 1 to about 4.5 g of
fortification components. In column 15, lines 17-25, Froseth discloses an embodiment where
the amount of tricalcium phosphate (TCP), i.e., mineral, in the binder is 3% on a weight basis.
Froseth also discloses that the binder makes up 40% of the cereal bar (see column 11, lines
15-16). For a 55 g serving basis, the amount of TCP in the cereal bar can be calculated to be
0.66 g of TCP (55 g × (40% binder) × (3% TCP in binder)). Therefore, the cereal bar of
Froseth does not fall within the fortification component range of about 1 to about 4.5 grams in
the energy bar set forth in claim 1. As such, the cereal bar of Froseth would not qualify as an
energy bar.
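
The TCP figure can be checked against the claimed fortification range in the same way. This is a sketch only: the names are hypothetical, and the word "about" in the claim is ignored, so the bounds are treated as exact values for illustration.

```python
# Minimal sketch: TCP per 55 g serving of the Froseth bar versus the
# claimed fortification range. Names are hypothetical; treating the
# "about" bounds as exact is a simplifying assumption.

SERVING_G = 55.0
BINDER_FRACTION = 0.40        # binder share of the cereal bar
TCP_IN_BINDER = 0.03          # TCP share of the binder, by weight
CLAIMED_FORTIFICATION_G = (1.0, 4.5)

def within(value, bounds):
    """Return True when value lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high

tcp_g = SERVING_G * BINDER_FRACTION * TCP_IN_BINDER

print(f"TCP per serving: {tcp_g:.2f} g")
print("within claimed range:", within(tcp_g, CLAIMED_FORTIFICATION_G))
```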



Rombauer is cited for disclosing a recipe for Pfeffernusse. The Office Action 
states that "an energy matrix made of corn syrup which is combined with a solid component,
grated lemon rind, which is mixed into a fat-carbohydrate matrix (butter and sugar)(page 708). 
The composition is considered to have a lubricious mouthfeel since the claimed ingredients 
are used." 

Applicants note, however, that Rombauer fails to meet the protein level
required by the range of about 5 to about 40 g, set forth in claim 1. The table below provides
a breakdown of the ingredients used to make the Pfeffernusse composition.



PFEFFERNUSSE

                                      Grams of Protein
Ingredient          Amount            (based on 55 g serving)

Flour               2.01 cups         3.21
Baking Powder       0.75 tsp
Baking Soda         0.13 tsp
Salt                0.25 tsp
Black Pepper        0.25 tsp          0.01
Nutmeg              0.25 tsp          0.01
Cinnamon            1 tsp             0.01
Fennel Seed         1 tsp             0.05
Butter              0.5 cups          0.03
Sugar               0.33 cup
Egg                 1                 0.47
Chopped Almonds     0.25 cup          0.82
Chopped Citron      1 tbsp
Orange Peel         0.25 cup
Molasses            0.33 cup
Corn Syrup          1 tbsp
Brandy              0.33 cup
Lemon Rind          1 tsp
Lemon Juice         1 tbsp

TOTAL                                 4.61



Applicants have determined that the protein content in the Pfeffernusse
composition is approximately 4.6 g. This does not fall within the protein range of about 5 to
about 40 g (based on a 55 g serving), claimed in claim 1. Moreover, the Pfeffernusse
composition is not seen to include fortification components. As such, the range of about 1 to
about 4.5 g of fortification components, set forth in claim 1, is not met. Clearly, the
Pfeffernusse composition of Rombauer does not qualify as an energy bar.
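
The protein total can be verified by summing the per-ingredient entries from the table above. The dictionary layout and names are assumptions made for illustration; ingredients with no protein entry in the table are omitted, and the claimed bounds are again treated as exact values.

```python
# Minimal sketch: sum the protein entries (grams per 55 g serving) listed
# in the Pfeffernusse table. The data structure is hypothetical; the
# values are the ones shown in the table.

protein_g = {
    "Flour": 3.21,
    "Black Pepper": 0.01,
    "Nutmeg": 0.01,
    "Cinnamon": 0.01,
    "Fennel Seed": 0.05,
    "Butter": 0.03,
    "Egg": 0.47,
    "Chopped Almonds": 0.82,
}

CLAIMED_PROTEIN_G = (5.0, 40.0)  # "about" is ignored for this illustration

total_protein_g = round(sum(protein_g.values()), 2)
meets_claimed_range = CLAIMED_PROTEIN_G[0] <= total_protein_g <= CLAIMED_PROTEIN_G[1]

print(f"total protein: {total_protein_g} g")
print("meets claimed protein range:", meets_claimed_range)
```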

Applicants respectfully submit that Kelly, Froseth, and Rombauer, whether 
taken alone or in any permissible combination, do not disclose or suggest the presently 
claimed invention of an energy bar that provides about 2 to about 55 g of carbohydrates, about 
1 to about 4.5 g of fortification components, about 5 to about 40 g of protein, about 2 to about 
10 g of fat, about 150 to about 300 calories, and a moisture content of less than about 15% by 
weight, based on a 55 g serving size, as set forth in claim 1.

Claim 2 directly depends from claim 1. For at least the same reasons discussed
above in connection with claim 1, claim 2 is patentable over Kelly, Froseth, and Rombauer 
whether considered alone or in any permissible combination. 

Like claim 1, independent claims 3, 7 and 18 each require that the energy bar
have about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g of fortification
components, about 5 to about 40 g of protein, about 2 to about 10 g of fat, about 150 to
about 300 calories, and a moisture content of less than about 15% by weight, based on a 55 g
serving size. For at least the same reasons discussed above for claim 1, claims 3, 7 and 18 are
patentable over Kelly, Froseth, and Rombauer, whether considered alone or in combination.

Claim 14 is a product by process claim and claim 18 is a method claim, which
require that the energy bar have about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g
of fortification components, about 5 to about 40 g of protein, about 2 to about 10 g of fat,
about 150 to about 300 calories, and a moisture content of less than about 15% by weight,
based on a 55 g serving size.

As previously noted, the Rombauer Pfeffernusse composition has 
approximately 4.6 g of protein (based on a 55 g serving) and no fortification components. 
Therefore the Pfeffernusse composition does not meet the protein level of about 5 g to about 
40 g of protein, and the fortification level of about 1 to about 4.5 g, set forth in claims 14 and 
16. As such, claims 14 and 16 are patentable over Rombauer. 

Claim 15 depends from claim 14, and claim 17 depends from claim 16. Claims 
15 and 17 are also patentable over Rombauer for the same reasons discussed above for claims 
14 and 16. 

In view of the foregoing remarks, Applicants respectfully request favorable 
reconsideration and early passage to issue of the present application. 

Applicants' undersigned attorney may be reached in our New York office by 
telephone at (212) 218-2100. All correspondence should continue to be directed to our below 
listed address. 




Attorney for Applicants 
Victor Tsu 

Registration No. 46,185 



FITZPATRICK, CELLA, HARPER & SCINTO 
30 Rockefeller Plaza 
New York, NY 10112-3801 
Facsimile: (212)218-2200 



In re Application of:

EDWARD L. RAPP ET AL 

Application No.: 10/615,249 

Filed: July 8, 2003 

For: TASTING ENERGY BAR 
(As Amended) 



Docket No. 02280.003720. 

Examiner: H. F. Pratt 
Group Art Unit: 1761 

Date: April 4, 2005 



THE COMMISSIONER FOR PATENTS 
P.O. Box 1450 
Alexandria, VA 22313-1450 

Sir: 

Transmitted herewith is an Amendment and a Terminal Disclaimer in the above-identified application. 



[X] No additional fee is required.



The fee has been calculated as shown below 



CLAIMS AS AMENDED

                 (2) CLAIMS                 (4) HIGHEST NO.    (5)
                 REMAINING AFTER            PREVIOUSLY         PRESENT                      ADDITIONAL
                 AMENDMENT                  PAID FOR           EXTRA      RATE              FEE

TOTAL CLAIMS     * 20             MINUS     ** 20              0          x $25 / $50       0.00
INDEP. CLAIMS    * 6              MINUS     *** 6              0          x $100 / $200     0.00

Fee for Multiple Dependent claims $180 / $360

TOTAL ADDITIONAL FEE FOR THIS AMENDMENT                                                     0.00



* If the entry in Column 2 is less than the entry in Column 4, write "0" in Column 5.
** If the "Highest Number Previously Paid For" IN THIS SPACE is less than 20, write "20" in this space.
*** If the "Highest Number Previously Paid For" IN THIS SPACE is less than 3, write "3" in this space.

[ ] Verified Statement claiming small entity status is enclosed, if not filed previously.

[ ] A check in the amount of $ is enclosed.

[ ] Charge $ to Deposit Account No. 06-1205. A duplicate copy of this sheet is enclosed.

[X] Any prior general authorization to charge an issue fee under 37 C.F.R. 1.18 to Deposit Account No.
06-1205 is hereby revoked. The Commissioner is hereby authorized to charge any additional fees under
37 C.F.R. 1.16 and 1.17 which may be required during the entire pendency of this application, or to
credit any overpayment, to Deposit Account No. 06-1205. A duplicate copy of this paper is enclosed.

[ ] A check in the amount of $ to cover the fee for a month extension is enclosed.

[X] A check in the amount of $130.00 to cover the Terminal Disclaimer fee is enclosed.

Applicants' undersigned attorney may be reached in our New York office by telephone at (212) 218-2100.
All correspondence should continue to be directed to our address given below.




Attorney for Applicants 
Registration No.: 46,185 



FITZPATRICK, CELLA, HARPER & SCINTO 
30 Rockefeller Plaza 
New York, New York 10112-3800
Facsimile: (212)218-2200

Chapter 6
Laboratory Studies: Types and Principles



Foods are submitted to sensory examination to provide information
that can lead to product improvement, quality maintenance, the
development of new products, or analysis of the market. This section
summarizes the most important types of sensory problems encountered
by food research groups and the main types of procedures used in
solving them. This chapter covers the use of laboratory panels, as do
Chapters 7 and 8. Consumer testing is discussed in Chapter 9, and
statistical procedures for evaluation of the results of both types of panels
are covered in Chapter 10.

Tests may be conducted to: (1) select qualified judges and study
human perception of food attributes; (2) correlate sensory with chemical
and physical measurements; (3) study processing effects, maintain
quality, evaluate raw material selection, establish storage stability, or reduce
costs; (4) evaluate quality; or (5) determine consumer reaction. Each
of these purposes requires appropriate tests. In general, laboratory
panels are used for the first three purposes, highly trained experts for
the fourth, and large consumer groups for the last.

In this text we distinguish between two types of laboratory panels:
(1) those which determine simple differences between treated samples;
and (2) those which determine directional differences. Both are
laboratory panels, and sometimes untrained judges are used, but it is the
thesis of this book that trained subjects are more useful. The
advantages of such panels are discussed in Chapter 7.

I. Types of Tests 

The most important types of tests and their utilization are briefly 
described here. More detailed information of each procedure is given 
in Chapters 7, 8, and 10. 

A. Difference Tests

The common true difference tests are referred to as single-stimulus,
paired-stimuli, duo-trio, triangle, and multi-sample tests. In tests which
do not reveal statistically significant differences between treatments, no
further evaluation is needed. When differences are found, however,
directional difference tests are used to establish the nature and magnitude of
difference. After a significant difference has been established by a
laboratory panel, consumers may be asked to express preferences.

Since most perceptual judgments are relative, single-sample
presentation is used infrequently, except at the consumer level. Expert tasters of
wines, beers, coffee, tea, and dairy products rate single samples, but they
evaluate the quality of many samples at a time and compare them
against their pre-established "memory standard." Occasionally a method
called "A-not A" is used (Peryam, 1958), in which a standard, A, is
presented followed by one or more coded samples. The judge indicates
which one(s) is (are) A. This method may be classified as a paired
comparison rather than single presentation since each coded sample is
compared with the standard.

In the paired-stimuli procedure, judges simply specify whether there
is a difference between two samples. When the judge also indicates what
sensory characteristic distinguishes the two samples, we speak of the test
as a paired-comparison. The samples are presented in a counter-balanced
design, and a forced-choice is usually required. One half of the responses
could be correct due to chance alone. The number of samples tested at a
single session will depend on the commodity, the experience of the
judges, and the amount of time and sample available. Paired testing is
typically used in comparing new with old processing procedures, in
quality control, and in preference testing at the consumer level.

The duo-trio is a modified paired presentation in which one sample is
identified and presented first, followed by two coded samples, one of
which is identical with the standard. The judge is asked which of the two
is the same as the first sample. This method is primarily a laboratory tool
for use with trained subjects. It lends itself to use for quality control and
for selection of judges of superior discrimination.

In the triangle test, two identical and one different samples are
presented simultaneously and the judge is asked to indicate the odd sample.
Correct identification due to chance alone is one third. Like the duo-trio
method, the triangle test should be used only by trained laboratory
judges, and is suited to similar problems.
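
The chance levels stated above (one half for paired and duo-trio presentations, one third for the triangle test) can be used to ask how likely a given number of correct answers would be by guessing alone. The following is a minimal sketch using an exact binomial tail; the function name and the example counts are hypothetical and are not taken from this chapter.

```python
# Minimal sketch: probability of at least `correct` right answers out of
# `trials` judgments if every judge were guessing. The function name and
# the example counts below are illustrative assumptions.
from math import comb

def chance_tail(correct, trials, p_chance):
    """Exact binomial upper-tail probability P(X >= correct)."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Hypothetical session: 12 correct identifications in 20 judgments.
paired_p = chance_tail(12, 20, 1 / 2)    # paired or duo-trio chance level
triangle_p = chance_tail(12, 20, 1 / 3)  # triangle chance level

print(f"paired/duo-trio: {paired_p:.4f}")
print(f"triangle:        {triangle_p:.4f}")
```

For the same count of correct answers, the triangle test gives the smaller tail probability because guessing succeeds less often when one sample in three must be picked.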

B. Rank Order 

Ranking is used to determine how several samples differ on the basis
of a single characteristic. A group of coded samples (which may contain
a control, or standard) are presented simultaneously, and the judge is
asked to rank them in order of the intensity of a specified characteristic.
This method is suitable for use by laboratory judges in product or process
evaluation, by experts for selecting the best sample for a particular use,
and by consumers for expressing relative acceptability among a limited
number of samples. It is of importance that all judges use the same
criteria. When necessary, one criterion (sweetness, for example) can be
ranked, after which another criterion (sourness, viscosity, etc.) may be
ranked in another set of the same samples.

C. Scoring Tests

The best use of scoring tests is in comparisons of a control sample
with several experimental samples. The scoring may be expressed in
terms of deviation from a reference, from "no difference from control" to
"very large difference from control." In other experiments, scores may be
used on an absolute basis if the scale is clearly defined and understood
by all judges. Although difference-from-control tests have been used
widely by laboratory panels, the results may be meaningless if the judges
change the basis of their scoring as the test proceeds, i.e., judges become
experts. Thus, this method is best suited for use by experts. The test may
be administered to consumers if it is clearly explained and the decisions
required are simple.

Tests in which deviation from a control is measured are used for
product or process evaluation and critical tests on basic perception of
sensory attributes. Scoring tests are also used in new-product
development, quality control, storage stability tests, screening of intensity levels,
and measuring judge characteristics such as leniency, reproducibility, or
central tendency (see Chapter 5).

D. Descriptive Tests 

Descriptive sensory analyses are best conducted only by highly
trained experts completely familiar with the product or the process. Such
tests are used effectively in new-product development, in product or
process improvement, for quality control, and for training judges for
future testing. One type of descriptive test, hedonic, in which degree of
liking is described, is suitable at the consumer level. Among the types of
descriptive tests currently in use are scalar scoring of various types,
hedonic ratings, semantic differential tests, and Arthur D. Little's "Flavor
Profile" (see Chapter 8, Section V).

E. Hedonic Scaling

Scoring is called hedonic when the judge expresses his degree of
liking by checking a point on a scale ranging from extreme disapproval
to extreme approval. A five- to nine-point balanced scale is usually
employed. Hedonic ratings are converted to scores and treated by rank
analysis or analysis of variance. As indicated above, this test has been
used both by experts and by untrained consumers, but we feel it is more
effectively applicable to the latter.
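
The conversion of hedonic ratings to scores can be illustrated with a short sketch. The nine category labels below are the commonly used wording of the nine-point hedonic scale and are an assumption here, since this section does not list them; the ratings are invented sample data, and the function name is hypothetical.

```python
# Minimal sketch: convert verbal hedonic ratings to numeric scores and
# summarize them. Labels, names, and ratings are illustrative assumptions.
from statistics import mean, stdev

NINE_POINT_SCALE = [
    "dislike extremely",
    "dislike very much",
    "dislike moderately",
    "dislike slightly",
    "neither like nor dislike",
    "like slightly",
    "like moderately",
    "like very much",
    "like extremely",
]

# Score 1 is extreme disapproval, score 9 is extreme approval.
SCORE = {label: i for i, label in enumerate(NINE_POINT_SCALE, start=1)}

def to_scores(ratings):
    """Map each verbal rating to its numeric score."""
    return [SCORE[rating] for rating in ratings]

# Hypothetical ratings from five consumers for one sample.
ratings = [
    "like moderately",
    "like very much",
    "like slightly",
    "neither like nor dislike",
    "like moderately",
]

scores = to_scores(ratings)
print("scores:", scores)
print(f"mean score: {mean(scores):.2f}")
print(f"standard deviation: {stdev(scores):.2f}")
```

The numeric scores produced this way are what would then be submitted to rank analysis or analysis of variance.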

F. Acceptance and Preference

Distinction should be made between acceptance, which is a
willingness to use or eat a product, and preference, which relates to a greater
degree of acceptance of one product over another when a choice is
presented. The acceptance or preferences of a laboratory panel are of
very limited value except in gross screening of treatments. Some of the
test methods described above can be adapted to measurement of
consumer reaction (see Chapter 9).

G. Other Methods

Dilution tests, described in Chapter 9, have been used for laboratory
testing of selected treatments, employing methods of presentation
described above, i.e., single, paired, and multiple samples. Threshold tests
are seldom used except in studies where it is desirable to establish the
minimum detectable difference of an additive or of an off flavor.
Threshold and dilution tests have been used to a limited extent to select judges
who can detect specific sensory properties. When so used, the test
materials and their concentrations should be the same as those likely to be
encountered in the actual test. Sequential analysis (Chapter 10) can
be used to analyse the results.

It is our belief that laboratory judges should be carefully selected and
screened on the basis of their sensitivity to the differences that may be
encountered in the experimental samples. In this sense, all laboratory
panels should consist of experts. It is recognized that in many
organisations the time, money, and personnel necessary to achieve this goal are
unavailable, but unless judges have had extensive training and
experience, they should not be expected to make meaningful evaluations of
quality, particularly of a descriptive nature. Neither should a laboratory
panel, whether small or large, experienced or inexperienced, presume to
predict consumer acceptance or preference. Preferences of a laboratory
group are representative only of a limited and unknown portion of the
consuming public. This concept is discussed in considerable detail in
Chapter 9.

II. Panel Selection and Testing Environment

Systematic analysis of the sensory properties of foods involves the use
of human subjects in a laboratory environment. The sensitivity and
reproducibility of the analytical tool (in this case, the judge) greatly
influence the direction and validity of the results. The environment under
which the judgments are obtained also influences the data. Of additional
importance are the time and labor and the supplies and equipment
involved, for these factors materially control the cost of sensory analyses.
We agree with Foster (1954) that more emphasis must be placed on
controlling physical and psychological influences in sensory testing of
foods. Unfortunately, the data available for a wide variety of food types
are not adequate for the determination of the optimum ranges for all
variables.

A. Panel Selection

There is considerable controversy in the literature on the value of a
sensory panel that has been selected and trained. Much of the confusion
has arisen because discrimination or difference tests have not been
distinguished from quality or consumer types of studies. In some cases a
failure to find differences between trained and untrained panels in ability
to discriminate has had its origin in methodological or statistical
deficiencies. Tarver and Ellis (1961) believe the following considerations are
important in selecting judges for flavor-difference tests: (1) precision or
inherent ability to duplicate a difference judgment; (2) reliability or
absence of bias in detecting a flavor difference; and (3) a tolerance level or
inherent sensitivity to a particular flavor difference. According to Kramer
et al. (1961), if the simulation of consumer reaction is the sole aim, a
trained panel is not needed and should be avoided. In some cases it may
be important to select individuals who are superior in their ability to
detect differences. It is difficult, if not impossible, with our present lack
of knowledge of consumer response, to select panels that will show good
agreement with consumer evaluation. The problem seems to be our
inability to define the difference and to train the panel to recognize the
difference. Furthermore, the consumer uses many criteria other than
sensory in evaluating foods.

Various procedures, based on intuition, rational judgment, or
experimentation, have been applied in selecting people whose performance in
sensory tests will be superior to that of an unselected population
(Dawson et al., 1963). These methods have been tested with varying degrees
of success. One major problem is the amount of pretesting work required
to establish reliable selection. A further difficulty may be an
experimenter's inability to specify accurately the nature of the panel member's
task. "Quickie" methods of panel selection, based upon only a few tests,
have generally not been very satisfactory. On the other hand, although
the tedious process of selecting subjects on the basis of sensitivity to the
basic tastes is often recommended, the method is of doubtful value
(Mackey and Jones, 1954; Peryam, 1958).

Since randomly selected and untrained individuals are variable in
their judgments, large panels are needed for results that are stable and
sensitive. By selecting the most stable and sensitive members and
training them, one might expect to obtain a small but efficient panel.
Selection is important since individuals differ considerably in sensitivity,
interest, motivation, and ability to judge differences. Discriminatory skill
need not be general; a good wine taster may not be a good judge of
chocolates. Girardot et al. (1952) found that candidates who did well on
some products often did poorly on others. Seldom is a judge equally
proficient in testing all qualities and all flavors of foods. The skill of a
connoisseur has been attributed to knowledge of what signs to look for
and how to interpret them rather than to increased sensitivity to stimuli
(Mobsncr, 1943). An ability or aptitude for flavor assessment could
conceivably vary in three ways: between individuals, between products, and
at different times for the same individuals and products (see Coppock
et al., 1952; Harvey, 1953). Thus it is evident that a general-purpose
panel will be less useful than a specific panel selected for the product
and method being tested. A general-purpose panel could be used for
gross screening, however, when precision must be sacrificed to save time
and expense. A sensory panel should be considered as a tool, and, as
such, it can be compared to suitable chemical methods (Lowe and
Stewart, 1947). Certain methods and tools may be used to show gross
differences, but, as the measurements needed become more refined and
precise, the methods and tools required for accurate sensory testing
become more sensitive.

Moser et al. (1950) considered that selection and training of judges
on the basis of sensitivities and consistencies are of extreme importance
in evaluating edible oils. In selecting panels, those investigators used a
double elimination test (see Chapter 6, Section II,C) based on acuity in
oil evaluation. In scoring bitterness in orange juice, Coote (1956)
illustrated the necessity of careful training and selection of panels for
estimating the degree of bitterness. For beer-tasting tests, Helm and Trolle
(1946) selected 20 out of 90 prospective judges. Those 20 had the highest
percentages of correct selections in triangle tests and were considered to
compose a far more suitable taste panel than the original group.
Kirkpatrick et al. (1957) showed the importance of panel selection for
evaluation of milk and biscuits.

Any method of selection should include a preliminary training period
designed to acquaint the tasters with the quality factors involved in the
product to be tested. This should be followed by a blind test designed to
show the individual's relative perception and discrimination (Harrison
and Elder, 1950).

B. Screening

Most investigators employ some type of screening process for
selecting panel members, including specific tests based on: (1) discriminating
differences between solutions or substances of known chemical
composition; (2) ability to recognize flavors or odors; (3) performance in
comparison with other panel members; and (4) ability to discriminate
differences in samples to be used later in the test. The pertinent question is
the extent to which selection devices are reflected in superior
performance in actual tests.

Kramer et al. (1961) reported that a single screening was
insufficient for selecting panel members of continued superior ability in
detecting flavor differences. After a first screening of 28 candidates, the 12
who performed best originally did not perform more efficiently than the
average of the original 28 candidates. A second screening resulted in a
more efficient group. Further screening and training would undoubtedly
have resulted in a still more efficient panel.

A general approach may be summarized, stepwise, as follows: (1) use
as test materials the same product that will be tested later; (2) prepare
tests to obtain variations in the product similar to those which will be
met with in the actual experiment; (3) adjust the difficulties of the test
so that the group as a whole will discriminate between samples but some
individuals will fail; (4) use test forms similar to those to be employed
later; (5) start with as large a group of candidates as is feasible and with
a selection test that is operationally simple if more than one stage is re-
quired; (6) screen on the basis of relative achievement, continuing until
a top-ranking group of the size desired may be reliably selected; and (7)
at each stage reject those who are obviously inadequate, but retain more
people than will be required for the panel. This procedure is not a rou-
tine task; it requires judgment by the experimenter, particularly as to the
criteria of achievement and as to how much data are needed for valid
selection. According to Girardot et al. (1952), the multiple-stage selec-
tion assumes a good positive correlation between skills, but it will not
be perfect.

It is felt that a person with previous experience might utilize some of
the skill he has developed from a knowledge of techniques. Furthermore,
he may note and detect differences which are unheeded by the inexperi-
enced judge. He can often describe the sensory impressions more fully
and usually has a better understanding of the particular terminology
employed.

It would, however, be impossible to test independently for all of the
characteristics or skills which may determine achievement. Christie
(1956) believes it is not necessary. Various factors underlie a unitary
skill and they may be separated analytically, but in any given sensory
test most of them will operate together. Realistic test situations may be
set up to include acts of discrimination and judgment such as will be
used later in definite experiments. Such tests will give each relevant
factor its proper weight, so relative performance will be an adequate
criterion for selecting the most useful panel members.

For selecting judges, Krum (1955) and Baker (1962) suggested that
candidates fill out a questionnaire covering the following items: experi-
ence, availability, age, sex, health, smoking habits, quantity of particular
foods habitually consumed, food prejudices, and asthmatic, physio-
cardiac, and respiratory behavior. It is doubtful whether this information
will be of great value; conclusive evidence against the influence of some
of these factors on perception has been noted in Chapters 2 and *
Baker's (1962) suggestion is interesting: that individuals with a physio-
cardiac or asthmatic condition might be useful for certain panels since
they seem to have lower thresholds for air pollutants. But the psychic
attitudes of such individuals might be so unfavorable as to interfere with

Krum (1955) wrote: "It is believed that sensory ability decreases with
age and that preferences change also." Therefore, he indicated panel
members should be between the ages of 20 and 50. The limiting factors
are lack of experience in younger people and loss of perceptual ability in
the older group. Panel members should be in good health and not physi-
cally fatigued or worried. They should not be overly susceptible to mouth
and sinus infections or have frequent head colds. Persons should be elimi-
nated who are allergic to the materials to be tested. For convenience and
more accurate judging, Krum would eliminate all who do not like or
refuse to eat a particular product. According to Overman and Jerome
(1948) the members of the panel are frequently selected for their inter-
est or their availability rather than for the acuity of their senses of taste
and smell. In too many studies we have to "make do" with the available
subjects.

C. Sensitivity Tests

In this section we discuss the many procedures that have been em-
ployed. In general, the screening tests use discrimination between solu-
tions of known chemical composition for taste, ability to recognize odors,
on-the-job performance in comparison with experienced panel members,
and ability to discriminate actual differences that will be found in the
samples to be used later in the tests. The experimental situation will
dictate which, if any, of these should be used.

For general panel selection we recommend that of the Quartermaster
group as outlined by Girardot et al. (1952). In the first stage, candidates
are eliminated primarily on the basis of lack of sensitivity to the sensory
attributes involved, and to a lesser extent because of poor memory, slow
recovery from stimulation, and failure to understand the test. In the
second stage the screening is done on the basis of ability to establish
and use stable subjective criteria. This double testing screens out those
who will do poorly because of lack of motivation, but it does not identify
in advance those who may lose interest during the course of a lengthy
experiment.

Threshold tests have been used as a basis of screening by many work-
ers. This procedure is seldom justified since there is little evidence that
sensitivity to the primary tastes is related to ability to detect differences
in foods. At most it is only a single factor in discriminatory ability. As
King (1937) and Hopkins (1954) demonstrated, thresholds vary greatly
between individuals and, except in extreme cases, no consistent relation
can be demonstrated between taste acuity and palatability and judges'
responses. Hall et al. (1959) determined the thresholds of candidates for
taste and flavors on two different days, and selected those sensitive to the
lowest concentrations who could duplicate their sensitivity. Hanson et al.
(1959) used ability to detect full-strength and dilute chicken broth in
selecting a panel for studying chicken flavor. A similar approach was
used by Tarver et al. (1959), who determined for each judge a bitterness
tolerance level (the recognition threshold for bitterness). Repeatability
(or precision) must also be determined by standard-to-standard compari-
son. Hall et al. (1959), using that procedure, found that success in dis-
tinguishing the odd sample in triangular testing of beers showed a good
correlation with the bitterness tolerance level.

Mackey and Jones (1954) tested 22 individuals to determine thresh-
olds for primary tastes in water solutions and their ability to arrange a
series in the order of concentration. Also tested was their ability to ar-
range, in proper order, applesauce, pumpkin, and mayonnaise containing
different levels of these same taste constituents. Both the water solutions
and foods could be so arranged, but the ability to arrange one properly
was not highly correlated with the ability to arrange the other properly.
Further, a high sensitivity did not correlate significantly with ability to
arrange foods in order of concentration of taste substances. The varia-
bility among the judges was high. This experiment should be repeated.

Similar conclusions were reached by King (1937), who found no cor-
relation between excellence in judging pure solutions and ability to rate
correctly samples of bread containing various quantities of sodium chlo-
ride, sucrose, lactic acid, and caffeine. He nevertheless suggested that the
ability to identify the basic tastes at low concentration was valuable.
Hopkins (1946) found a low but significant correlation between judges'
ratings and the actual salt content of beef. Moreover, Krum (1955) also
proposed that preliminary selection be based on sensitivity to the four
primary tastes. From the results of such tests he would eliminate those
who had low sensitivity. Knowles and Johnson (1941) classified judges
on the basis of their sensitivity to the primary tastes but found no cor-
relation between ability to identify the primary tastes and experience in
judging foods. See also repeatability estimates of Sawyer et al. (1962).

Various selection tests were given to prospective panel members by
Pfaffmann and Schlosberg (1952-1953), including: (1) a questionnaire
designed to reveal habits, preferences, and interest in eating and drink-
ing; (2) an odor recognition test consisting of 20 common odorous sub-
stances thought to measure interest in odors; (3) a low-odor recognition
series approaching a threshold test; (4) a graded series of solutions to
determine thresholds for the four primary tastes (salt, sweet, sour, and
bitter); (5) use of the Elsberg blast-injection techniques to determine
threshold for oil of wintergreen, to detect gross departures from normal
sensitivity, as from nasal obstruction; and (6) sixteen duo-trio tests on
mayonnaise and thirty on an orange drink. The results failed to reveal
clear evidence that any item on the questionnaire predicted performance
in flavor discrimination. Selection scores on the battery of analytical tests
described did not correlate well with the performance scores. The relia-
bility coefficient (between test and retest) and the validity coefficient
were very low.° Most noticeable was the rather unstable performance of
the panel members for short-term work. No general clear-cut panel
ability was evident, so that prediction of a given individual's later per-
formance would be difficult. Those workers believe, however, that predic-
tion of the relative ability of panel members is possible. They reported
that, with the three panels tested, the score on a single discrimination
session indicated who would do better on later tests: those who scored in
the upper half of the total group. It is a gross measure, however, and its
use might eliminate some persons who would be good performers.

° The words reliability and validity along with such terms as precision, accuracy,
and relevance are often interpreted differently. A method of estimation which, on
the average, gives the true value is called an unbiased method. Unbiased estimates
are sometimes termed accurate or valid. The precision of a method refers to re-
peatability and is the ability of the method to produce estimates which are very
close together (even if it is a biased method and is not actually measuring the true
value). Thus accuracy (or validity) is related to lack of bias and precision to
standard deviation.




II. Panel Selection and Testing Environment 




Discrimination was measured by Morse (1954) in terms of the degree
to which the individual or group can distinguish between two stimuli and
communicate this distinction to investigators. Factors which affect dis-
criminability are: (1) the individual's taste acuity at the time of the test;
(2) the consistency or stability of this ability with time; (3) the distance
or difference between the stimuli; (4) the design of the test, especially
of its complexity and the premium it places on memory; and (5) the
method of communicating the results from the subject to the investigator.
Any conclusion on discriminability depends on the arbitrary standard set
by the investigator of the number of correct versus incorrect judgments
required. Morse required 10 correct judgments out of 12 trials for a
judge to be declared discriminative, reasoning that such a ratio of judg-
ments between equal stimuli could have occurred by chance in slightly
less than 5% of similar repeated trials.
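
The chance probability behind a criterion such as 10 correct out of 12
can be checked with a short binomial calculation. The sketch below is
illustrative only: it assumes a two-choice (paired) judgment with a chance
probability of 1/2 per trial, which the text does not state explicitly, and
the function name binomial_tail is hypothetical.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: each trial is a two-choice judgment, so chance success is 1/2.
one_sided = binomial_tail(12, 10, 0.5)  # 10 or more correct by chance
two_sided = 2 * one_sided               # a 10:2 split in either direction

print(f"one-sided: {one_sided:.4f}, two-sided: {two_sided:.4f}")
```

Under these assumptions the two-sided probability falls a little under
5%, which agrees with the reasoning attributed to Morse; the one-sided
value is roughly half of that.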

Many workers have used paired or duo-trio (Chapter 7, Section
VI,A) tests for panel selection. Tarver et al. (1959) used a paired test
for establishing bitterness tolerance levels. Byer and Gray (1953) used
paired tests with beer samples, and applied χ² for determining the con-
sistency of the judges. In selecting a panel for coffee testing, Harrison
and Elder (1950) presented candidates with six cups of coffee consisting
of three sets of pairs over a period of 20 to 30 days. The candidates were
then ranked in decreasing order of their successes in making the correct
pairings, and only the top half was used. Bliss (1960) used replicate
paired tests with each subject. Stability of preferences was used as the
selection criterion. Lockhart (1951) noted that any of the binomial sys-
tems provides a means for rapidly selecting panel members whose sensi-
tivities can be described in terms of probability levels. These systems can
also be used for checking the sensitivities of the panel on a day-to-day
or week-to-week basis.

The most common method of choice has been the triangle test (Chap-
ter 7, Section VI,B). It was first used by Bengtsson and Helm (1946)
and Helm and Trolle (1946) for selecting beer tasting panels. Beers of
known differences were used first in simple tests and later in more diffi-
cult tests. Only the most sensitive individuals were used. Data from the
tests were used to check panel performance. The Quartermaster group
(Girardot et al., 1952) used a triangle test in the first stage of selection.
Simple tests were used first, but later the tests were of increasing diffi-
culty. The judges were ranked on the basis of their percentages of correct
judgments. All judges took about the same number of tests at each level
of difficulty. Only the ranking near the cut-off point is critical.

Bradley (1955) recommended repeated triangle tests for selecting
judges. Sequential methods (Chapter IV, Section III) can be recom-
mended because of their efficiency and because they focus attention on
the risk of accepting poor judges or of rejecting good ones. Using both
paired and triangle tests, Schlosberg et al. (1954) found that a judge's
relative performance during the first two days of testing "had a fair
predictive value for his relative over-all performance during the follow-
ing 20-day period." This was not true when preference for milk was
measured, but that result will be discussed later (Chapter 6, Section
II,F). Their experience was that ability for one panel did not carry over
to another. Hening (1948) used the triangle test to select panels for
distinguishing differences in flavor resulting from time and temperature
of storage of various products. Amerine (1948) recommended it for se-
lecting wine panels. Krum (1955) likewise used it, noting that each
candidate should take the same number of tests. The cut-off point was
determined by the number of panel members required and the precision
required by the problem. Moser et al. (1950) found one experienced
judge with an excellent record in testing oil but a poor record in detect-
ing diacetyl by triangle tests. They attributed this disparity to confusion
on the part of the subject. However, this judge may have been insensitive
to low concentrations of diacetyl, even though reputed to have a keen
sense of smell.

Dawson et al. (1963) showed that for taste thresholds the paired
comparison resulted in lower thresholds than the triangular, and that the
single-sample procedure was the least sensitive.

Various methods of scoring have been used in selecting panels.
Hedonic scores were used by Girardot et al. (1952). Similar procedures
have been used or reviewed by Sharp et al. (1936), Trout and Sharp
(1937), Boggs and Hanson (1949), Harrison et al. (1954), and others.
Used to evaluate performance have been average deviations between
duplicate scores, the deviation from the score of a control sample intro-
duced in series, or the deviations of scores between first and second
tastings (with the samples coded and presented in different orders). Al-
though these measure individual reproducibility, they do not relate re-
producibility with one sample to ability to find differences between
unlike samples. To rectify this, the correlation coefficient between the
first score and duplicate scores for a series of samples of varying quality
may be used. Bennett et al. (1956) used the standard error of the means
to measure ability to reproduce judgments. Hopkins (1946) calculated
both correlation coefficients and regression equations to relate each
judge's assessment to the average of the panel. A range of sensitivity was
demonstrable and the suitability of individuals for tests could be evalu-
ated. The correlation coefficients were much higher for biscuits than for
dried milk. Moser et al. (1950) likewise calculated the correlation co-

of individual capabilities but does not ensure a specified level of pro-
ficiency.
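
The correlation between a judge's first and duplicate scores over a
series of samples of varying quality, as described above, might be
computed as in the following sketch. The scores are invented for
illustration and the function name pearson_r is hypothetical; the
ordinary product-moment (Pearson) coefficient is assumed.

```python
from math import sqrt


def pearson_r(first, duplicate):
    """Product-moment correlation between two equal-length score lists."""
    n = len(first)
    if n != len(duplicate) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(first) / n
    mean_y = sum(duplicate) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, duplicate))
    sxx = sum((x - mean_x) ** 2 for x in first)
    syy = sum((y - mean_y) ** 2 for y in duplicate)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary across samples")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one judge scores six samples twice, in coded order.
first_scores = [8, 6, 7, 3, 5, 2]
duplicate_scores = [7, 6, 8, 4, 5, 2]
r = pearson_r(first_scores, duplicate_scores)
print(f"r = {r:.3f}")
```

A judge whose coefficient is high both repeats himself and separates
unlike samples; a judge who gives every sample nearly the same score may
show small duplicate deviations yet a poor correlation.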

Kramer (1955) recommended choosing judges on the basis of their
ability to detect differences at a given probability level. His procedure
involved matching concentrations, and the tables he published should be
useful whether or not duplicates are available for all samples.

Probably because of their extensive use in industry, control charts
have been used in selecting panels or maintaining level of performance.
A control chart is a statistical device used principally for the study and
control of repetitive processes. Such charts are based on the theory that
variations due to chance occur in a random pattern and that the fre-
quencies approach those of the binomial distribution. To see whether a
process is out of control, past data are plotted on a control chart. If the
data conform to a pattern of random variation within the control limits,
the process will be judged as being in control. Reliability is indicated by
the narrowness of spread between control limits. Since pre-established
standards can be set up, the control chart also measures the validity of
results (see the basic texts of Feigenbaum, 1951, and Duncan).
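
As a rough sketch of how such a chart might be applied to a panel, the
code below sets control limits on the proportion of correct judgments per
session and flags sessions falling outside them. The three-standard-
deviation limits, the normal approximation to the binomial, the standard
proportion of 0.70, and all function names are assumptions made for
illustration; none is taken from the sources cited.

```python
from math import sqrt


def p_chart_limits(p0: float, n: int, k: float = 3.0):
    """Lower and upper control limits for a proportion, clipped to [0, 1]."""
    sigma = sqrt(p0 * (1 - p0) / n)  # binomial standard deviation of a proportion
    return max(0.0, p0 - k * sigma), min(1.0, p0 + k * sigma)


def out_of_control(proportions, p0: float, n: int, k: float = 3.0):
    """Indices of sessions whose proportion correct lies outside the limits."""
    lower, upper = p_chart_limits(p0, n, k)
    return [i for i, p in enumerate(proportions) if p < lower or p > upper]


# Hypothetical record: proportion correct in five sessions of 20 tests each.
sessions = [0.70, 0.65, 0.75, 0.30, 0.70]
flagged = out_of_control(sessions, p0=0.70, n=20)
print(flagged)
```

Narrower limits (more tests per session, or a smaller multiplier k) make
the chart more sensitive but also raise the risk of flagging a judge who
is merely having an ordinary bad day.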
Control charts have been recommended by Marcuse (1945, 1947),
Moser et al. (1950), Harrison and Elder (1950), Krum (1955), Coote
(1956), and Tarver and Ellis (1961). With them, not only an individual's
performance but that of an entire panel can be held to a given precision.
Harrison et al. (1954) defined the efficiency of a panel in terms of the
probability of the panel's acceptance of definite differences in the
samples. To eliminate the number of correct selections through chance
alone, the scores were corrected with the following formula:

    S = \frac{100(R - C)}{100 - C}

where S is the percent score corrected for chance expectation, R the raw
percent score, and C the percent score expected by chance.
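
A small sketch of the chance correction follows. It assumes the usual
form S = 100(R - C)/(100 - C), which is consistent with the definitions
of S, R, and C given above; the function name and the example figures are
hypothetical.

```python
def chance_corrected_score(raw_percent: float, chance_percent: float) -> float:
    """Percent score corrected for chance: S = 100 (R - C) / (100 - C)."""
    if not 0 <= chance_percent < 100:
        raise ValueError("chance_percent must be in [0, 100)")
    return 100.0 * (raw_percent - chance_percent) / (100.0 - chance_percent)


# Paired test (chance = 50%): 75% correct is halfway between chance and perfect.
paired = chance_corrected_score(75.0, 50.0)

# Triangle test (chance = 100/3 %): a raw score at chance level corrects to zero.
triangle_at_chance = chance_corrected_score(100.0 / 3.0, 100.0 / 3.0)

print(paired, triangle_at_chance)
```

The correction places scores from tests with different chance levels on a
common footing, so that zero always means chance performance and 100
means perfect performance.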

More elaborate mathematical procedures may be used in certain
cases: multiple-factor analysis, item analysis, discriminant functions,
product-moment correlation coefficients (Filipello, 1957), etc.

In most cases a simple test using some binomial procedure may be 
used to eliminate insensitive judges. See Amerine et al (1959) for de- 
tailed procedures used for wine panels. Analysis of variance or some 
sequential procedure should be used for more complex situations or to 
maintain the panel at some desired level of performance. 

Variation among 30 judges in scoring scrambled eggs containing vari-
ous amounts of added primary-taste compounds was described by Hop-
kins (1946). Significant variation (p 0.01) was observed among judges.
Some statistically significant discrimination among groups of samples
containing different test substances and among concentrations of those
substances was also found. Individual scores became progressively more
erratic as quality deteriorated. Hopkins concluded that no consistent
relation between taste acuity alone and palatability judgments should be
anticipated. Quality evaluation includes visual, olfactory, and tactile
sensations as well as taste sensitivity, and is further conditioned by the
scoring methods used and the experience and frame of reference of the
judges (see also Chapter 8).

Sensitivity to taste or odor appears to be only one factor influencing
discrimination. In most cases, elaborate tests based on acuity seem un-
necessary since absolute sensitivity to the basic tastes is not closely re-
lated to perceptual skills.

D. Panel Size 

The number of judges needed in a given experiment will vary ac-
cording to the variabilities of the individuals and of the product. A pre-
liminary experiment will give information from which can be calculated
the number of judges necessary to secure a given level of statistical
significance. As quality decreases, variability among judges increases and
panel size must be increased to obtain differences which are statis-
tically significant (Boggs and Hanson, 1949; Kefford and Christie, 1960).
A good example of this is found in work by Hopkins (1946, 1947) with
biscuits, dried eggs, butter, dried milk, and bacon. He noted that, at low
levels of acceptability, discrimination was very erratic, so that more
judges were required for significance in results. Not enough information
is available, however, on the interrelationship of acceptability and dis-
crimination. In incomplete-block studies, Hanson et al. (1951) found,
surprisingly, that the error of the panel means was greater for samples of
intermediate quality.
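
One way such a calculation from a preliminary experiment might look is
sketched below. It is not the method of any author cited here: it assumes
a paired (within-judge) comparison, a normal approximation, a two-sided
5% significance level, and 80% power, and the function name and figures
are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def judges_needed(sd: float, difference: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of judges needed to detect a mean score difference.

    sd is the standard deviation of within-judge score differences estimated
    from the preliminary experiment; difference is the smallest mean
    difference the experimenter wishes to detect.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sd / difference) ** 2)


# Hypothetical figures: judges agree fairly well on a good-quality product ...
good_quality = judges_needed(sd=1.5, difference=1.0)
# ... but scores become more erratic as quality falls.
poor_quality = judges_needed(sd=3.0, difference=1.0)
print(good_quality, poor_quality)
```

Because the required number grows with the square of the variability,
doubling the spread among judges roughly quadruples the panel size, which
is in line with the observation that poorer products call for larger
panels.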

Of course, the panels must be much larger in preference testing than
in difference testing. Hopkins (1947) concluded that, with bacon vary-
ing in degree of saltiness, panels of 35 judges were necessary to discrimi-
nate sensory differences of 5% with intrapanel comparisons. For inter-
panel comparisons, 02 judges would be necessary. Girardot et al. (1952)
preferred panels of 30 to 90 in food-development studies. Bengtsson and
Helm (1946) preferred large panels (50 to 100) in testing for differences
which might influence future work. For routine control, 10 to 30 judges
were believed adequate. Krum (1955) found panels of 10 to 30 sufficient.
When only three or four individuals were available he believed it pos-
sible to repeat the tests enough times to get a suitable number of results.

TABLE 4

RULES FOR SCALE DESIGN

... one root word (e.g. like/dislike) ...
... modified with modifiers of the root such as 'very' ...
... too many scale points.



develop his/her own scales only rarely! It is preferable and safer to use
scales which you have used previously with demonstrated success (defined
statistically) or which have been used and demonstrated by others. The
most well known scale in food research is the nine-point hedonic scale
(Table 5) developed by the US Army in the 1940s. It is interesting to note
that this scale satisfies the five points mentioned above: it is adequately long
(nine points), it possesses a neutral point, it uses one root word
(like/dislike), it uses the same modifiers above and below neutral

of frequency (Table 6)

and amount (Table 7), both key concepts in food attitude research. These
data provide guidelines for selecting four- to nine-point category scales for
these concepts. The four-point scale of frequency (never, sometimes, often

TABLE 5

THE NINE-POINT HEDONIC SCALE USED FOR FOOD
ACCEPTANCE AND FOOD PREFERENCE

9 Like extremely
8 Like very much
7 Like moderately
6 Like slightly
5 Neither like nor dislike
4 Dislike slightly
3 Dislike moderately
2 Dislike very much
1 Dislike extremely

the left-hand side of the paper. The question raised was 'Should the scale
begin with the "dislike" or "like" end of the rating scale?'. They found, with
a list of 45 food items, that the proportion of answers in each of the nine
categories was almost identical (correlation coefficient 0.96). However,
there were significant differences between the form of the scale in which
'dislike extremely' was placed on the extreme left and the one in which 'like
extremely' was placed in that position. Beginning the scale with 'dislike
extremely' led to a significantly greater frequency of the 'dislike' categories.
Beginning the scale with 'like extremely' did not produce the analogous
increased frequency for 'like' categories. In practical terms the effects are
very small. The correlation between the 45 pairs of food means was 0.997,
and it is the mean which is used for predictive purposes with these data. The
researchers suggested that the scale should begin with 'like extremely' but
hastened to add that no clear problem resulted from the reverse.

The issue of preference frequency has been another focus of food
preference scaling and has been phrased in a variety of ways: 'How often
would you like to eat the menu items?', 'How often would you be willing to
eat the item?', 'How often would you like to see the food offered?', and
'How often would you like to eat the item?'.

TABLE 8

SCALES OF PREFERRED FREQUENCY

Preference frequency scales have been of two types, one using verbal
categories of frequency and the other using quantitative categories (Table
8). Almost all frequency scales used have had four or nine categories. The
verbal-based scales have depended heavily on the existing temporal system
of day, week and month. Two scales have used the term 'often' in addition
to [...];
this could represent difficulties in trying to translate into actual temporal
units. Benson (1958) also used a four-category scale but stuck to temporal
terms (once a day, week, month, year). Hartmuller (1971) and Knickrehm
et al. (1967) used identical nine-category scales from 'twice a day' to 'once a
month' (plus 'never want'). The QMFCI research on frequency scales also
used a nine-category scale which overlapped greatly with the one just cited.
In some administrations it was extended to 'every three months' and to
'once a year'. The question which arises then is: 'What is the most
appropriate time frame for the preference frequency scale?' This question
has not been directly addressed, most scales using the month as the unit.
For most purposes it would appear that items consumed only once per year
would be insignificant, unless very specialised food services (class A
restaurants, catering) were of interest.
The QMFCI scale also listed the frequency per month of all verbal scale
categories. The category 'every three months' was rated 0.3 and the
category 'once a year' was rated 0.1. This reinforces the use of the month as
the unit. It also provides both the test respondent and the researcher with a
quantified scale for analysis and prediction. In some cases (e.g. Knickrehm
et al., 1967) subjects responded on the frequency scale by listing the number
of the verbal category. For example, twice a week was coded as 4. The
potential problem here is that the test respondent is not using the actual
frequency statement in his answer, whereas in other scales he is.
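
The monthly ratings quoted for the two extended categories are
consistent with a simple conversion of each verbal category to occasions
per month, rounded to one decimal place. The sketch below illustrates
this for the two categories whose ratings are given in the text; the
conversion rule and the function name are assumptions, not a description
of how the QMFCI values were actually derived.

```python
def per_month(times: float, period_months: float) -> float:
    """Convert 'times per period' to occasions per month, to one decimal."""
    return round(times / period_months, 1)


every_three_months = per_month(1, 3)   # once in three months
once_a_year = per_month(1, 12)         # once in twelve months
print(every_three_months, once_a_year)
```

Expressing every category in the same monthly unit is what allows the
verbal scale to be treated as a quantified scale for analysis.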

A preference scale (Fig. 1) developed more recently for the military used
a quantitative preference frequency scale based on the week and month
(Meiselman et al., 1972). The subject was asked how often he would like an
item in terms of days per week (answer 1, 2, 3, 4, 5, 6 or 7) and weeks per
month (answer 1, 2, 3 or 4). While this does directly ask the preference
frequency question in quantitative terms, it forces the subject into a
week-month system. If he wants squash 13 times per month, he cannot so
indicate. Further, it assumes that the weekly pattern is repeated. This is also
the case in some verbal categories scales. A more recently developed survey
(Fig. 2) (Meiselman and Waterman, 1978) avoids weekly units and asks for
preference frequency per month using a scale which permits coding of any
number from 0 to 31 (actually 39 is possible) days per month. Note again
that the monthly unit was the unit of choice.

The numerical and verbal scales possibly reduce to the same thing when
the subject is using numbers in the numerical scale and using the verbal
categories in the non-numerical scale. When the subject uses numerical
codes for the verbal scale categories, problems can arise. The focus of his
attention is then on a number which does not directly represent frequency.
He then begins to use the category scale of numbers without necessarily
referring them to their referent frequencies. This is similar to what can
happen in the hedonic acceptance scale in which one begins to use a number
without realising its referent (extremely good, very bad, etc.).

One potential advantage of certain quantitative scales of frequency is
that they can be ratio scales, that is, scales with equal intervals and a zero
point. Ratio scales permit statements of ratios so that one could say x is
preferred twice as often as y, etc. The frequency scale developed by US
Army Natick Laboratories (Meiselman and Waterman, 1978) is such a
scale (from 0 to 39). Both the old QMFCI scale and the scale used by
Meiselman et al. (1972) are not continuous series of numbers; hence the
subject is selecting categories rather than dealing in ratios.
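
The gaps in the days-per-week by weeks-per-month format can be listed
directly. The sketch below assumes, as the text implies, that the monthly
frequency under that format is the product of the two answers (the weekly
pattern being repeated); the variable names are illustrative.

```python
# Answers permitted by the week-month format described above.
days_per_week = range(1, 8)     # 1 to 7
weeks_per_month = range(1, 5)   # 1 to 4

# Assumption: monthly frequency = days per week x weeks per month.
reachable = sorted({d * w for d in days_per_week for w in weeks_per_month})
missing = [n for n in range(1, 29) if n not in reachable]

print("reachable:", reachable)
print("missing:  ", missing)
```

Thirteen times per month is among the values that cannot be expressed,
which illustrates why such a scale yields categories rather than a
continuous series of numbers.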

The scales discussed so far have been either hedonic scales or preference
frequency scales. Schutz (1965) developed a food action rating scale (FACT
Scale), by scaling 18 action statements representing affective attitudes
towards foods. Nine were selected to give equal intervals. The standard
deviation and mean of the FACT scale and the nine-point hedonic scale are
very similar; the two scales correlate 0.97 for food means. The overall
tendency for the FACT means to be lower than the hedonic means
apparently results from slightly lower FACT ratings for desserts and
semisolid and liquid foods.

Van Riter (1956) used a scale based on home use of foods (specifically
vegetables) including scale categories: 'never served at home', 'one or more
of my family dislike the food', and 'prepared differently at home'. These
categories are indicators of factors that are possibly important in food
preference determination. Whether they are good measures of the
preferences themselves is unclear without a more complete evaluation.

2.4. Examples of Food Preference Data 

Although a large amount of food preference data is collected by various 
institutions and commercial organisations, little of it reaches the open 
literature. However, there is a growing body of data for the investigator to 
tap so that many food preference decisions need not be made intuitively. 
One of the largest of available data bases is that of the United States Armed 
Forces which have been collecting food preference data for almost 40 years. 

... role in the fitting algorithm. Following Carroll (1972), it is necessary to 
distinguish two modes of analysis. In internal analysis the objective is to 
achieve a consensus configuration of the stimuli based solely on the 
preference data. In external analysis the aim is to relate preferences to 
physicochemical measurements using as parsimonious a model as possible 
to take account of individual differences in scoring patterns. 

3.3.2. Internal Analyses 

The simplest approach to modelling individual differences in preference is 
the vector model proposed by Tucker (1960). The set of stimulus points are 
embedded in a multidimensional space and each subject is represented by a 
vector in the space. The ordering of the projections of the stimulus points 
on to the vector gives the preference ranking of that subject. The cosine of 
the angle that a vector makes with the dimensions of the space is considered 
to be proportional to the relative importance of that dimension in the 
preference judgement. 
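
The vector model described above can be illustrated with a minimal sketch. It is not the MDPREF algorithm and is not taken from the source: the function name preference_ranking, the four stimulus coordinates, and the subject vector are all hypothetical values chosen for illustration. The sketch normalises a subject vector to unit length, projects each stimulus point on to it, and orders the stimuli by projection; the components of the unit vector are the direction cosines with respect to each dimension.

```python
import numpy as np

def preference_ranking(stimuli, subject_vector):
    """Rank stimuli for one subject under the vector model (illustrative only)."""
    v = np.asarray(subject_vector, dtype=float)
    v = v / np.linalg.norm(v)                  # unit vector: components are direction cosines
    projections = np.asarray(stimuli, dtype=float) @ v
    order = np.argsort(-projections, kind="stable")   # most preferred stimulus first
    return order, projections, v

# Hypothetical two-dimensional configuration of four stimuli (assumed data).
stimuli = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [-1.0, 0.0],
                    [0.5, 0.5]])

# Hypothetical subject whose preferences load mainly on dimension 1.
order, projections, cosines = preference_ranking(stimuli, [2.0, 1.0])
print("ranking (stimulus indices):", order.tolist())
print("direction cosines:", np.round(cosines, 3).tolist())
```

A larger direction cosine on a dimension means that dimension carries more weight in this subject's ranking, which is the interpretation given in the paragraph above.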

An example from our own experience demonstrates the use of the vector 
model very effectively. The data (unpublished) was generated at the Torry 
Research Station, Aberdeen and we are grateful to P. Howgate for 
permission to use it. In this study, 48 subjects were asked to rate six types of 
fish or fish product on an hedonic (Peryam/Pilgrim) scale: 1 = dislike 
extremely, 9 = like extremely. For brevity Table 4 shows the session means 
for only six subjects, A-F. The complete data was input to the MDPREF 
program (Chang and Carroll, 1968), and the two-dimensional solution, 
which accounts for 85.3% of the variation, appears in Fig. 9. The subjects 
appear as points on the unit circle and a preference ranking is obtained by 

... standard normal r.v. [...] Three other probability distributions [...] 
in statistical [...]. They are called the t distribution, the χ² distribution, 
and the F distribution. The t and χ² distributions depend on only one 
parameter, whereas the F distribution [...]. In statistical terminology, the 
parameters of these distributions are called the degrees of freedom (DF) 
parameters. For the F distribution the two parameters are identified as the 
"numerator" and "denominator" DF. The percentiles of the t distribution 
are given in Table [...]. As the [...] distribution. That is, for [...] 
denominator DF are given in Table A-4. [...] that 

1/F[...] = F[...]. 

There are other probability distributions [...] 

[...] the population [...] to be estimated by [...]. For example, statistic 
X̄, the sample mean, may be used to estimate population mean μ. Also S², 
the sample variance, can be used to estimate population variance σ². A 
statistic, when it is used to estimate a parameter, is called an estimator. 
Since an estimator [...] 

... the extent of sampling variation [...]. If the parameter being 
estimated is such that the [...], then θ̂ is called an unbiased estimator. 
The mean of the distribution of θ̂ is also called the expected value of θ̂ 
and is denoted by E(θ̂). In this notation, an estimator θ̂ of θ is unbiased 
if E(θ̂) = θ. The sample mean X̄ and variance S² are unbiased estimators of 
the population mean μ and variance σ², respectively. That is, E(X̄) = μ and 
E(S²) = σ². The sample standard deviation S is not an unbiased estimator of 
the population standard deviation σ. In our notation, E(S) ≠ σ. In addition 
to the unbiasedness, [...] such as consistency [...] which may be used in 
deciding [...] details because [...] use only the statistically established 
[...]. 

An estimator when used to estimate a [...] an interval, for example, 
X̄ ± [...], [...] the sample standard deviation [...] an interval estimate 
may or may not contain the [...]. But we can ask: How [...] contain the 
true value of [...] S/√n is an estimate of the stand- 

... probability distribution of X̄ is the following. Consider a population 
with mean μ [...]. Suppose that we can list all possible random samples 
[...], compute X̄ for each sample, thus generating [...] distribution of X̄ 
for [...] 

Testing of Hypotheses 

Another area of statistical infer 
theses. The testing of hypotheses con 

1. Formulation of hypotheses 

2. Collection, analysis of data, 

3. Specification of a decision rule 
or rejecting hypotheses 

The formulation of the hypo 
proposed experiment. For instance 
whether a process modification 
researcher proceeds by producing 
modified process. Let μ denote the 
while μ0, the mean texture of the 
hypothesis is written as 

H0: 

which states that on the average 
modified process. The alternative 

Ha: 

which states that there is a change 
process. The alternative Ha in ( 
may be less than μ0 (μ < μ0) 
alternative hypothesis is either 

The formulation of a one-sided 

of the experimenter. 
If, instead of in the mean, 

similarly formulate null and alter 

alternative hypotheses have been 
used to develop statistical decisi 
Since a statistical decision 
account for sampling variations 
of the null hypothesis on the ba 
rejection of the null hypothesis 
the probability of which is de 
α = 0.05 indicates that on the 
hypothesis 5 times in 100 cases, 
the statistical test; values of 0. 